
Dynamic WHT rounds in TurboQuant#7330

Merged
connortsui20 merged 3 commits into develop from ct/sorf
Apr 8, 2026

Conversation

@connortsui20
Contributor

@connortsui20 connortsui20 commented Apr 7, 2026

Summary

Tracking issue: #7297

Adds the ability to use a dynamic number of FWHT rounds in the SORF algorithm. Previously, this was hardcoded to 3.

Also fixes a small validation bug.

Testing

Existing tests suffice, and some regression tests were added for the validation bug.
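For context, a minimal Rust sketch of what a configurable round count could look like. The names `fwht`, `sorf_transform`, `signs`, and `rounds` are illustrative, not the crate's actual API; each round applies random sign flips followed by an unnormalized in-place fast Walsh-Hadamard transform:

```rust
/// In-place, unnormalized FWHT over a power-of-two-length slice.
fn fwht(v: &mut [f32]) {
    let n = v.len();
    assert!(n.is_power_of_two());
    let mut h = 1;
    while h < n {
        for chunk in v.chunks_exact_mut(2 * h) {
            for i in 0..h {
                let (a, b) = (chunk[i], chunk[i + h]);
                chunk[i] = a + b;
                chunk[i + h] = a - b;
            }
        }
        h *= 2;
    }
}

/// Apply `rounds` rounds of (sign flip, FWHT) instead of a hardcoded 3.
/// `signs[r]` holds the ±1.0 sign pattern for round `r`.
fn sorf_transform(v: &mut [f32], signs: &[Vec<f32>], rounds: usize) {
    assert!(signs.len() >= rounds);
    for r in 0..rounds {
        for (x, s) in v.iter_mut().zip(&signs[r]) {
            *x *= s;
        }
        fwht(v);
    }
}

fn main() {
    let mut v = [1.0f32, 0.0, 0.0, 0.0];
    fwht(&mut v);
    println!("{v:?}"); // impulse spreads to all coordinates
}
```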

@connortsui20 connortsui20 requested a review from lwwmanning April 7, 2026 22:34
@connortsui20 connortsui20 added the changelog/feature (A new feature) label Apr 7, 2026
@codspeed-hq

codspeed-hq bot commented Apr 7, 2026

Merging this PR will not alter performance

✅ 1122 untouched benchmarks
⏩ 1530 skipped benchmarks¹


Comparing ct/sorf (2b54123) with develop (bbb371c)

Open in CodSpeed

Footnotes

  1. 1530 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
@connortsui20 connortsui20 marked this pull request as ready for review April 8, 2026 14:40
@connortsui20 connortsui20 enabled auto-merge (squash) April 8, 2026 14:40
@connortsui20 connortsui20 merged commit 2e00050 into develop Apr 8, 2026
59 of 60 checks passed
@connortsui20 connortsui20 deleted the ct/sorf branch April 8, 2026 14:42
// Norms dtype must match the element ptype of the Vector, with the parent's nullability.
// Norms carry the validity of the entire TurboQuant array.
let element_ptype = vector_metadata.element_ptype();
let expected_norms_dtype = DType::Primitive(element_ptype, dtype.nullability());
Contributor


Noooo, we want these to be same or wider. E.g., I think for f16, norms should be f32.

Contributor Author

@connortsui20 connortsui20 Apr 8, 2026


I don't think we want that, as that would be an implicit cast. If you wanted to store f16 with f32 precision, you should upcast it first (not inside this quantization encoding).

Contributor Author


Oh, actually, if multiplication overflows, maybe there is an argument for upcasting? But then we might as well always store norms as f64, because theoretically anything can overflow given large enough elements and lots of dimensions... So I would rather keep the behavior as is.
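The overflow concern above can be demonstrated with a small self-contained example (the values here are made up for illustration): squaring elements that are individually representable in a narrow float type can still overflow to infinity, while a wider accumulator stays finite.

```rust
fn main() {
    // Each element fits comfortably in f32 (f32::MAX ≈ 3.4e38)...
    let v = [3.0e19f32; 4];

    // ...but each square is 9e38, which overflows f32 during norm computation.
    let norm_f32: f32 = v.iter().map(|&x| x * x).sum();

    // Accumulating in f64 keeps the same squared norm well in range.
    let norm_f64: f64 = v.iter().map(|&x| (x as f64) * (x as f64)).sum();

    println!("f32 norm: {norm_f32}, f64 norm: {norm_f64}");
    assert!(norm_f32.is_infinite());
    assert!(norm_f64.is_finite());
}
```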

.iter()
.map(|&v| if v != 0 { 0u32 } else { F32_SIGN_BIT })
.collect();
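The snippet above builds a mask of either 0 or the f32 sign bit per element. A quick sketch of the underlying trick (with `F32_SIGN_BIT` defined locally for illustration): XOR-ing a float's bit pattern with the sign bit negates it, and XOR with 0 is a no-op, so the mask selects sign flips branchlessly.

```rust
// IEEE 754 single-precision sign bit (defined here for illustration).
const F32_SIGN_BIT: u32 = 0x8000_0000;

/// XOR a float's bit pattern with a mask: negates when the mask is the
/// sign bit, leaves the value unchanged when the mask is 0.
fn flip_sign(x: f32, mask: u32) -> f32 {
    f32::from_bits(x.to_bits() ^ mask)
}

fn main() {
    println!("{}", flip_sign(1.5, F32_SIGN_BIT)); // negated
    println!("{}", flip_sign(3.0, 0)); // unchanged
}
```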
// The storage is in inverse application order: round k-1 first, then k-2, ..., 0.
Contributor


Is there any point in keeping it in inverse order?

Contributor Author

@connortsui20 connortsui20 Apr 8, 2026


No not really, that's just how the previous implementation did it.

Given that we need both orders, I don't think it matters; I can change it in a follow-up PR.


Labels

changelog/feature (A new feature)



3 participants